21. Assessing Summary

Assess: Summary

Assessing is the second step in the data wrangling process:

  • Gather
  • Assess
  • Clean

You can assess data for:

  • Quality: issues with content. Low-quality data is also known as dirty data.
  • Tidiness: issues with structure that prevent easy analysis. Untidy data is also known as messy data. Tidy data requirements:
    1. Each variable forms a column.
    2. Each observation forms a row.
    3. Each type of observational unit forms a table.

…using two types of assessment:

  • Visual assessment: scrolling through the data in your preferred software application (Google Sheets, Excel, a text editor, etc.).
  • Programmatic assessment: using code to view specific portions and summaries of the data (pandas' head, tail, and info methods, for example).
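
As a rough illustration of programmatic assessment, the sketch below runs the pandas methods named above on a small table. The data, the column names (patient_id, name, weight_kg), and the variable names are invented for this example and do not come from the lesson. The missing-value and duplicate counts at the end are one possible way to surface quality issues, not a prescribed checklist.

```python
import io

import pandas as pd

# Hypothetical sample data, invented purely for illustration.
csv_text = """patient_id,name,weight_kg
1,Ana,61.5
2,Ben,
3,Cy,70.2
3,Cy,70.2
"""

df = pd.read_csv(io.StringIO(csv_text))

# View specific portions of the data.
first_rows = df.head(2)   # first two rows
last_rows = df.tail(2)    # last two rows

# Print a summary: column names, non-null counts, and dtypes.
df.info()

# Two simple quality checks (assumed for this sketch):
# missing values per column, and fully duplicated rows.
missing_per_column = df.isna().sum()
duplicate_rows = int(df.duplicated().sum())

print(first_rows)
print(last_rows)
print(missing_per_column)
print("duplicate rows:", duplicate_rows)
```

On a real dataset, the output of info is often a good first look, since a non-null count lower than the row count points to missing values in that column.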